Optenemos los datos del S&P-500¶

optenemos las nombres de las empresas que conforman el S&P-500 de wikipedia

In [2]:
sp500 = pd.read_html('https://en.wikipedia.org/wiki/List_of_S%26P_500_companies')[0]
sp500["Symbol"]=sp500["Symbol"].map(lambda x: x.replace(".","-"))
sp500_symbols = sp500["Symbol"].to_list()

Una vez optenidos estos nombres de las empresas que conforman el S&P-500, usamos el API de Yahoo Finance para optener sus datos, desde el año 2000 hasta el año 2021

In [3]:
df_sp500 = yf.download(tickers = sp500_symbols, start = "2000-01-01", 
                              end = "2021-12-31", interval = "1d")
sp500_symbols = df_sp500["Close"].columns.to_list()
df = df_sp500.sort_index()
df.head()
[*********************100%***********************]  503 of 503 completed

1 Failed download:
- CEG: Data doesn't exist for startDate = 946702800, endDate = 1640926800
Out[3]:
Adj Close ... Volume
A AAL AAP AAPL ABBV ABC ABMD ABT ACN ADBE ... WYNN XEL XOM XRAY XYL YUM ZBH ZBRA ZION ZTS
Date
2000-01-03 00:00:00 44.221336 NaN NaN 0.854541 NaN 3.016691 18.25000 9.347849 NaN 16.274672 ... NaN 2738600.0 13458200.0 582300.0 NaN 3033493.0 NaN 1055700.0 1199600.0 NaN
2000-01-04 00:00:00 40.843319 NaN NaN 0.782495 NaN 2.810732 17.81250 9.080766 NaN 14.909397 ... NaN 425200.0 14510800.0 317700.0 NaN 3315031.0 NaN 522450.0 816100.0 NaN
2000-01-05 00:00:00 38.309811 NaN NaN 0.793945 NaN 3.028807 18.00000 9.064076 NaN 15.204173 ... NaN 500200.0 17485000.0 1188000.0 NaN 4642602.0 NaN 612225.0 1124700.0 NaN
2000-01-06 00:00:00 36.851116 NaN NaN 0.725238 NaN 3.258996 18.03125 9.381233 NaN 15.328290 ... NaN 344100.0 19461600.0 534300.0 NaN 3947658.0 NaN 263925.0 1112100.0 NaN
2000-01-07 00:00:00 39.922035 NaN NaN 0.759592 NaN 3.683030 17.93750 9.481391 NaN 16.072985 ... NaN 469500.0 16603800.0 1401000.0 NaN 6063647.0 NaN 333900.0 782000.0 NaN

5 rows × 3018 columns

Calculamos Algunas Metricas:¶

  • Gap Return
  • IntraDay Return
  • Variation Per Day
  • Trailing Volatility Over Time
In [5]:
gap_returns = np.log(df["Open"]/df["Close"].shift(1))
intraday_returns = np.log(df["Close"]/df["Open"])
df_variation =  df["Adj Close"].pct_change()
df_volatility=df_variation.rolling(250).std()*100*np.sqrt(250)

Cual es el mejor día para invertir?¶

  • teniendo en cuenta el retorno de los movimiento gap
  • teniendo en cuenta el retorno de los movimientos intradiarios
In [6]:
weekday = gap_returns.index.map(lambda x: x.weekday())

best_day=pd.concat([
    gap_returns.groupby(weekday).mean().T.mean().rename("Gap_return mean"),
    gap_returns.groupby(weekday).std().T.mean().rename("Gap_return std"),
    
    intraday_returns.groupby(weekday).mean().T.mean().rename("IntraDay_return mean"),
    intraday_returns.groupby(weekday).std().T.mean().rename("IntraDay_return std"),
    
    df_volatility.groupby(weekday).mean().T.mean().rename("Volatility"),
],axis=1)

best_day.reset_index(inplace=True)
best_day["Date"] = best_day["Date"].map({0:"Mon",1:"Tue",2:"Wed",3:"Thu",4:"Fri"})
best_day.rename(columns={"Date":"Day"},inplace=True)
best_day
Out[6]:
Day Gap_return mean Gap_return std IntraDay_return mean IntraDay_return std Volatility
0 Mon 0.000089 0.012936 -0.000298 0.019721 32.949131
1 Tue 0.000649 0.011787 0.000122 0.019740 32.913444
2 Wed 0.000166 0.011617 0.000226 0.020011 32.908567
3 Thu 0.000041 0.011965 0.000558 0.020366 32.877352
4 Fri 0.000288 0.012269 0.000276 0.018750 32.765478
In [7]:
fig, axs = plt.subplots(1,2, figsize=(20,5))
sns.barplot(x=best_day["Day"],y=best_day["Gap_return mean"],ax=axs[0]);
axs[0].set_title("Mean Gap Return per Day of the Week");
sns.barplot(x=best_day["Day"],y=best_day["IntraDay_return mean"],ci=best_day["IntraDay_return std"],ax=axs[1]);
axs[1].set_title("Mean IntraDay Return per Day of the Week");
  • El Mejor dia para invertir segun el retorno GAP es Martes
  • El Mejor dia para invertir segun el retorno Intra Diario es Jueves

Cuales fueron los momentos de alta volatilidad que afectaron al S&P-500?¶

In [8]:
Vix = df_volatility.T.mean().dropna()
fig = px.line(x=Vix.index, y=Vix, title="S&P-500 Volatility Over Time",labels=dict(x="Date", y="Volatility"))
fig.show()

Vemos que en los años 2000, ya vienen de una alta volatilidad, probablemente causada por la Burbuja de las .com.

En en 2007 vuelve a haber una alta volatilidad en el S&P-500, debido a la Crisis de las hipotecas subprime, hasta fines del 2010.

Finalmente el ultimo momento de alta volatilidad en el mercado fue en 2020, devido a la pandemia mundial causada por el Covid19

Ahora analizaremos un poco las empresas (y sectores) que conforman el S&P-500.¶

In [9]:
df_perCompany=pd.DataFrame( sp500[['Symbol', 'GICS Sector']])
df_perCompany.rename(columns={"Symbol":"Ticker"},inplace=True)
In [10]:
for ticker in sp500_symbols:
    df_adjClose_ticker=df["Adj Close"][ticker].dropna()
    if df_adjClose_ticker.shape[0]==0:
        continue
    year_index = df_adjClose_ticker.index.map(lambda x: x.year)

    first_close, last_close = df_adjClose_ticker.iloc[[0,-1]]
    total_return = (last_close/first_close)-1
    first_year = df_adjClose_ticker.index[0].year
    last_year = df_adjClose_ticker.index[-1].year

    years=last_year-first_year+1
    returnPerYear=[]
    for year in range(first_year,last_year+1):
        first_close_year, last_close_year = df_adjClose_ticker[year_index==year].iloc[[0,-1]]
        year_return= (last_close_year/first_close_year)-1
        returnPerYear.append(year_return)
    mean_return_per_year = np.mean(returnPerYear)
    volatility = np.std(returnPerYear)
    df_perCompany.loc[df_perCompany["Ticker"]==ticker,["years","total_return","mean_return_per_year","volatility"]]=years,total_return,mean_return_per_year,volatility
    
In [11]:
df_perCompany
Out[11]:
Ticker GICS Sector years total_return mean_return_per_year volatility
0 MMM Industrials 22.0 5.588992 0.104859 0.196200
1 AOS Industrials 22.0 33.993191 0.199614 0.257195
2 ABT Health Care 22.0 13.970998 0.146395 0.180032
3 ABBV Health Care 9.0 4.608987 0.240326 0.229127
4 ABMD Health Care 22.0 18.838904 0.314810 0.580944
... ... ... ... ... ... ...
498 YUM Consumer Discretionary 22.0 27.333428 0.182262 0.177138
499 ZBRA Information Technology 22.0 22.875070 0.196867 0.308217
500 ZBH Health Care 21.0 3.691292 0.106400 0.264423
501 ZION Financials 22.0 0.618679 0.079462 0.352188
502 ZTS Health Care 9.0 7.405851 0.281776 0.167108

503 rows × 6 columns

Calculamos nuestro propio Ratio

$$ MyRatio = \frac{totalReturn}{years}*\frac{meanReturnPerYear}{volatility - RF} $$

$ RF = 0.01/255 $

In [13]:
Rf = 0.01/255
df_perCompany["Return_Volatility_Ratio"] = (df_perCompany["mean_return_per_year"]*df_perCompany["total_return"])/((df_perCompany["volatility"]-Rf)*df_perCompany["years"])
In [14]:
df_perCompany
Out[14]:
Ticker GICS Sector years total_return mean_return_per_year volatility Return_Volatility_Ratio
0 MMM Industrials 22.0 5.588992 0.104859 0.196200 0.135801
1 AOS Industrials 22.0 33.993191 0.199614 0.257195 1.199401
2 ABT Health Care 22.0 13.970998 0.146395 0.180032 0.516507
3 ABBV Health Care 9.0 4.608987 0.240326 0.229127 0.537230
4 ABMD Health Care 22.0 18.838904 0.314810 0.580944 0.464063
... ... ... ... ... ... ... ...
498 YUM Consumer Discretionary 22.0 27.333428 0.182262 0.177138 1.278647
499 ZBRA Information Technology 22.0 22.875070 0.196867 0.308217 0.664219
500 ZBH Health Care 21.0 3.691292 0.106400 0.264423 0.070740
501 ZION Financials 22.0 0.618679 0.079462 0.352188 0.006346
502 ZTS Health Care 9.0 7.405851 0.281776 0.167108 1.387850

502 rows × 7 columns

Cuales son las 9 mejores empresas para invertir¶

In [15]:
top9_companies=df_perCompany.sort_values(by="Return_Volatility_Ratio",ascending=False)[0:9]
top9_companies
Out[15]:
Ticker GICS Sector years total_return mean_return_per_year volatility Return_Volatility_Ratio
322 MNST Consumer Staples 22.0 1060.729174 0.538903 0.935435 27.777751
352 ODFL Industrials 22.0 366.419743 0.341535 0.348595 16.319983
330 NFLX Communication Services 20.0 510.597445 0.665004 1.127123 15.063184
443 TSLA Consumer Discretionary 12.0 223.014230 1.055824 2.039080 9.623143
347 NVDA Information Technology 22.0 329.243697 0.597267 0.943422 9.474912
245 IDXX Health Care 22.0 163.686694 0.285135 0.243441 8.716008
448 TSCO Consumer Discretionary 22.0 270.747039 0.409297 0.692645 7.272673
176 EPAM Information Technology 10.0 47.081430 0.525872 0.343443 7.209812
376 POOL Consumer Discretionary 22.0 139.707631 0.289053 0.272894 6.727335
In [17]:
fig, axs=plt.subplots(1,1,figsize=(15,5))
sns.barplot(x=top9_companies["Ticker"],y=top9_companies["Return_Volatility_Ratio"]);
In [18]:
fig = px.sunburst(top9_companies, path=['GICS Sector', 'Ticker'], values='Return_Volatility_Ratio',
                  color='total_return')
fig.show()

Cuales son las mejores industrias que pertenecen al S&P-500¶

In [19]:
df_perSector=df_perCompany.groupby("GICS Sector").mean()
Rf = 0.01/255
df_perSector["Return_Volatility_Ratio"] = (df_perSector["mean_return_per_year"]*df_perSector["total_return"])/((df_perSector["volatility"]-Rf)*df_perSector["years"])
df_perSector.sort_values("Return_Volatility_Ratio",ascending=False,inplace=True)
df_perSector
Out[19]:
years total_return mean_return_per_year volatility Return_Volatility_Ratio
GICS Sector
Consumer Staples 20.545455 41.317334 0.132782 0.215313 1.240415
Consumer Discretionary 19.810345 33.109084 0.243166 0.439972 0.923787
Health Care 20.203125 28.735543 0.224978 0.378442 0.845642
Communication Services 16.961538 29.852878 0.153494 0.345736 0.781482
Industrials 19.328767 21.458846 0.190819 0.280363 0.755725
Information Technology 18.648649 24.796384 0.250625 0.441870 0.754240
Real Estate 21.032258 16.081849 0.198249 0.341831 0.443505
Materials 18.964286 13.308978 0.161402 0.305811 0.370441
Financials 20.621212 11.283377 0.144718 0.295260 0.268225
Utilities 21.607143 9.062460 0.134937 0.228246 0.248000
Energy 19.809524 8.060907 0.156099 0.402859 0.157689
In [20]:
fig, axs=plt.subplots(1,1,figsize=(30,8))
sns.barplot(x=df_perSector.index, y=df_perSector["Return_Volatility_Ratio"]);
In [21]:
min_ratio=df_perCompany["total_return"].min()
max_ratio=df_perCompany["total_return"].max()
total_return_scale = (df_perCompany["total_return"]+1-min_ratio)/(max_ratio-min_ratio)

fig = px.sunburst(df_perCompany, path=['GICS Sector',"Ticker"], values=total_return_scale,
                  color='volatility')
fig.show()